Decoding the Text Encoding
Abstract
Word clouds are among the most popular and widely used forms of text visualization. Despite their attractiveness and the ease of producing them, word clouds do not give a thorough picture of the distribution of the underlying data. It is therefore important to be able to redesign word clouds, both to improve their design choices and to enable further statistical analysis of the data. In this paper we propose a fully automatic redesign algorithm for word cloud visualizations. The proposed method decodes an input word cloud and recovers the raw data as a list of (word, value) pairs. To the best of our knowledge, our work is the first attempt to extract raw data from word cloud visualizations. We have tested the proposed method both qualitatively and quantitatively. The results of our experiments show that the algorithm extracts the words and their weights effectively, with a considerably low error rate.
Similar resources
A Novel Patch-Based Digital Signature
In this paper a new patch-based digital signature (DS) is proposed. The proposed approach, similar to steganography methods, hides the secure message in a host image. However, it uses a patch-based key to encode/decode the data, like cryptography approaches. Both the host image and key patches are randomly initialized. The proposed approach consists of encoding and decoding algorithms. The encodin...
Nearly Tight Bounds on the Encoding Length of the Burrows-Wheeler Transform
In this paper, we present a nearly tight analysis of the encoding length of the Burrows-Wheeler Transform (bwt) that is motivated by the text indexing setting. For a text T of n symbols drawn from an alphabet Σ, our encoding scheme achieves bounds in terms of the hth-order empirical entropy Hh of the text, and takes linear time for encoding and decoding. We also describe a lower bound on the en...
From Paragraphs to Vectors and Back Again
I investigate some methods of encoding text into vectors and decoding these vector representations. The purpose of decoding vector representations is twofold. Firstly, I could apply unsupervised learning algorithms to the paragraph vectors to find significant "new" vectors and decode them into paragraphs of text. Effectively, I could process text and generate "new" ideas. Secondly, I could dec...
Lempel-Ziv Decoding in External Memory
Simple and fast decoding is one of the main advantages of LZ77-type text encoding used in many popular file compressors such as gzip and 7zip. With the recent introduction of external memory algorithms for Lempel–Ziv factorization there is a need for external memory LZ77 decoding but the standard algorithm makes random accesses to the text and cannot be trivially modified for external memory co...
Neural Responding Machine for Short-Text Conversation
We propose Neural Responding Machine (NRM), a neural network-based response generator for Short-Text Conversation. NRM takes the general encoder-decoder framework: it formalizes the generation of response as a decoding process based on the latent representation of the input text, while both encoding and decoding are realized with recurrent neural networks (RNN). The NRM is trained with a large ...
Decoding Running Key Ciphers
There has been recent interest in the problem of decoding letter substitution ciphers using techniques inspired by natural language processing. We consider a different type of classical encoding scheme known as the running key cipher, and propose a search solution using Gibbs sampling with a word language model. We evaluate our method on synthetic ciphertexts of different lengths, and find that...
Journal title:
- CoRR
Volume abs/1412.6079
Publication date 2014